-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix dataset downloading #7864
Fix dataset downloading #7864
Conversation
- prolong export cache lifetime on cache requests - change 500 to 404 on missing cache file during downloading - remove rq job removal after result retrieval - make export filenames more distinct - fix possible infinite deferring of dependent rq jobs
WalkthroughThe changes introduce new functionalities for handling concurrent export and cleanup processes in the CVAT application, alongside updates to package dependencies. Key additions include new classes and methods for multiprocessing, lock management, and export file handling. These improvements aim to enhance the reliability and efficiency of export operations. Additionally, multiple package versions have been upgraded to maintain compatibility and security. Changes
Sequence Diagram(s) (Beta)sequenceDiagram
participant User
participant CVATApp
participant ExportManager
participant LockManager
participant FileHandler
User->>CVATApp: Initiate export
CVATApp->>ExportManager: Start export process
ExportManager->>LockManager: Acquire lock
LockManager-->>ExportManager: Lock acquired
ExportManager->>FileHandler: Export data
FileHandler-->>ExportManager: Data exported
ExportManager->>LockManager: Release lock
LockManager-->>ExportManager: Lock released
ExportManager-->>CVATApp: Export completed
CVATApp-->>User: Export successful
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (invoked as PR comments)
Additionally, you can add CodeRabbit Configration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Out of diff range and nitpick comments (3)
cvat/apps/engine/views.py (3)
Line range hint
44-44
: Remove unused import to clean up the code.- import cvat.apps.dataset_manager.views # pylint: disable=unused-import
Line range hint
1966-1966
: Remove unnecessary f-string as it does not contain any placeholders.- server_address = request.scheme + '://' + server_address = str(request.scheme) + '://'
Line range hint
2713-2713
: Remove unnecessary f-string as it does not contain any placeholders.- server_address += request.get_host() + server_address = server_address + str(request.get_host())
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (10)
- cvat/apps/dataset_manager/apps.py (1 hunks)
- cvat/apps/dataset_manager/default_settings.py (1 hunks)
- cvat/apps/dataset_manager/util.py (2 hunks)
- cvat/apps/dataset_manager/views.py (4 hunks)
- cvat/apps/engine/views.py (4 hunks)
- cvat/requirements/base.in (1 hunks)
- cvat/requirements/base.txt (10 hunks)
- cvat/requirements/development.txt (3 hunks)
- cvat/requirements/production.txt (1 hunks)
- utils/dataset_manifest/requirements.txt (2 hunks)
Files skipped from review due to trivial changes (5)
- cvat/apps/dataset_manager/apps.py
- cvat/apps/dataset_manager/default_settings.py
- cvat/requirements/base.in
- cvat/requirements/development.txt
- utils/dataset_manifest/requirements.txt
Additional Context Used
Ruff (4)
cvat/apps/dataset_manager/views.py (1)
27-27:
.util.LockMode
imported but unusedcvat/apps/engine/views.py (3)
44-44:
cvat.apps.dataset_manager.views
imported but unused
1966-1966: f-string without any placeholders
2713-2713: f-string without any placeholders
Additional comments not posted (6)
cvat/requirements/production.txt (1)
13-13
: Update ofexceptiongroup
to1.2.1
appears safe and beneficial.cvat/apps/dataset_manager/util.py (1)
93-146
: Introduced a robust locking mechanism usingRedlock
. Ensure thorough testing, especially for edge cases like rapid lock acquisition and release.cvat/requirements/base.txt (1)
Line range hint
53-325
: Updates to package versions inbase.txt
are minor and likely include improvements or security patches. Verify compatibility with existing codebase.cvat/apps/dataset_manager/views.py (1)
Line range hint
27-149
: Integration of new locking mechanism and error handling in dataset management views appears well-implemented. Recommend additional testing for robustness.cvat/apps/engine/views.py (2)
2962-2962
: Ensure that the new caching logic in_export_annotations
is correctly implemented and tested.
3022-3022
: Confirm that the error handling inexport_annotations
covers all potential failure scenarios effectively.Verification successful
The output from the revised script provides a comprehensive list of exception handling within the
export_annotations
method and related areas in thecvat/apps/engine/views.py
file. The exceptions handled includeValidationError
,PermissionDenied
,NotFound
,AttributeError
,IntegrityError
,APIException
,ValueError
,CloudStorageModel.DoesNotExist
, and generalException
. This indicates a robust error handling mechanism that covers a wide range of potential issues that could arise during the execution of theexport_annotations
method.Given this information, it appears that the error handling within the
export_annotations
method is indeed comprehensive, covering various specific and general exceptions that could occur. This aligns with the initial review comment's concern about ensuring that all potential failure scenarios are effectively covered.Conclusion:
The error handling in theexport_annotations
method appears to be comprehensive and well-implemented, addressing a variety of error conditions that could potentially disrupt the functionality.
- The method handles specific exceptions like
ValidationError
,PermissionDenied
, andNotFound
, which are crucial for robust API behavior.- It also covers database-related errors through
AttributeError
andIntegrityError
.- General exceptions and unexpected errors are caught by broad
Exception
handling.- Special cases like
CloudStorageModel.DoesNotExist
are also considered, which is important for operations involving external storage.This thorough approach ensures that the method can gracefully handle errors and provide appropriate responses, thereby enhancing the reliability and user experience of the dataset exporting functionalities.
Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Check all branches of error handling in the `export_annotations` method to ensure comprehensive coverage. # Test: Search for error handling branches in the method. Expect: Each branch handles a specific error scenario appropriately. rg --type python $'export_annotations'Length of output: 68
Script:
#!/bin/bash # Description: Re-check all branches of error handling in the `export_annotations` method to ensure comprehensive coverage. # Test: Search for error handling branches in the method. Expect: Each branch handles a specific error scenario appropriately. rg 'export_annotations' --files-with-matches | xargs rg 'except'Length of output: 3895
/check |
❌ Some checks failed |
❌ Some checks failed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Out of diff range and nitpick comments (6)
cvat/apps/dataset_manager/util.py (1)
Line range hint
16-41
: Consider adding more detailed docstrings for the utility functions to improve maintainability and clarity for other developers.+ """ + Creates a zip archive from a directory. + + Args: + src_path (str): The source directory path. + dst_path (str): The destination zip file path. + """tests/python/rest_api/test_resource_import_export.py (2)
Line range hint
32-57
: Comprehensive test cases for exporting resources to cloud storage. Consider adding comments to explain the test scenarios in detail for better maintainability.+ """ + Test saving resources to cloud storage with specific configurations. + This test covers various object types and ensures that the export functionality + respects the specified cloud storage locations and settings. + """
Line range hint
182-207
: Import functionality tests are well-covered. Ensure that edge cases, such as network failures or partial data availability, are also tested.Would you like me to help by adding some tests for edge cases involving network failures or partial data availability?
cvat/apps/engine/views.py (3)
Line range hint
44-44
: Remove unused import to clean up the code.- import cvat.apps.dataset_manager.views # pylint: disable=unused-import
Line range hint
1967-1967
: The f-string used here does not contain any placeholders, which is unnecessary. Consider using a regular string if no formatting is needed.- server_address = request.scheme + '://' + server_address = str(request.scheme) + '://'
Line range hint
2714-2714
: Similar to the previous comment, an f-string is used here without placeholders. Use a regular string instead.- filename_pattern = build_annotations_file_name( + filename_pattern = str(build_annotations_file_name(
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (11)
- changelog.d/20240508_144202_mzhiltso_fix_dataset_downloading.md (1 hunks)
- cvat/apps/dataset_manager/default_settings.py (1 hunks)
- cvat/apps/dataset_manager/util.py (2 hunks)
- cvat/apps/dataset_manager/views.py (4 hunks)
- cvat/apps/engine/views.py (4 hunks)
- cvat/requirements/base.in (1 hunks)
- cvat/requirements/base.txt (10 hunks)
- cvat/requirements/development.txt (1 hunks)
- cvat/requirements/testing.txt (1 hunks)
- tests/python/rest_api/test_resource_import_export.py (5 hunks)
- utils/dataset_manifest/requirements.txt (2 hunks)
Files skipped from review due to trivial changes (1)
- cvat/requirements/testing.txt
Files skipped from review as they are similar to previous changes (4)
- cvat/apps/dataset_manager/default_settings.py
- cvat/requirements/base.in
- cvat/requirements/base.txt
- cvat/requirements/development.txt
Additional Context Used
Ruff (3)
cvat/apps/engine/views.py (3)
44-44:
cvat.apps.dataset_manager.views
imported but unused
1967-1967: f-string without any placeholders
2714-2714: f-string without any placeholders
Additional comments not posted (5)
changelog.d/20240508_144202_mzhiltso_fix_dataset_downloading.md (1)
1-5
: Changelog entry is clear and properly linked to the PR.utils/dataset_manifest/requirements.txt (2)
8-9
: Ensure that non-binary flags forlxml
andxmlsec
are required for your project's needs.Verification successful
The comprehensive search across the codebase confirms the consistent use of
--no-binary
flags forlxml
andxmlsec
across various requirement files and Dockerfiles. This indicates a deliberate and justified choice to compile these libraries from source, likely for compatibility or optimization reasons specific to the project's needs. The presence of related development libraries in Dockerfiles further supports this requirement.
- utils/dataset_manifest/requirements.txt: Both
lxml
andxmlsec
are set to compile from source.- cvat/requirements/: Multiple environment-specific files (base, testing, development, production) also enforce non-binary installation.
- Dockerfile: Includes libraries like
libxmlsec1-dev
andlibxmlsec1-openssl
, which are related toxmlsec
, suggesting system-level dependencies that might require source compilation.Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify compatibility of updated package versions. # Test: Search for the package usage in the Python files. Expect: No deprecated functions used. rg --type py 'lxml|xmlsec'Length of output: 28
Script:
#!/bin/bash # Description: Comprehensive search for any references to 'lxml' or 'xmlsec' in the entire codebase. # Search all files for 'lxml' or 'xmlsec' to ensure no usage is missed. rg 'lxml|xmlsec'Length of output: 1026
21-21
: Update totqdm==4.66.4
looks good. Ensure it's tested for compatibility with your project's usage.cvat/apps/dataset_manager/util.py (1)
104-141
: Locking mechanism usingRedlock
is implemented correctly. Ensure thorough testing, especially for edge cases in concurrent environments.cvat/apps/dataset_manager/views.py (1)
179-227
: Robust error handling inclear_export_cache
. Ensure that all possible exceptions are considered and handled appropriately.
… zm/fix-dataset-downloading
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Out of diff range and nitpick comments (3)
cvat/apps/engine/views.py (3)
Line range hint
44-44
: Remove unused import to clean up the code.- import cvat.apps.dataset_manager.views # pylint: disable=unused-import
Line range hint
1967-1967
: The f-string is used without placeholders, which is unnecessary. Convert it to a regular string.- instance_timestamp = datetime.strftime(instance_update_time, "%Y_%m_%d_%H_%M_%S") + instance_timestamp = datetime.strftime(instance_update_time, "%Y_%m_%d_%H_%M_%S")
Line range hint
2714-2714
: The f-string is used without placeholders, which is unnecessary. Convert it to a regular string.- instance_timestamp = datetime.strftime(instance_update_time, "%Y_%m_%d_%H_%M_%S") + instance_timestamp = datetime.strftime(instance_update_time, "%Y_%m_%d_%H_%M_%S")
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- cvat/apps/engine/views.py (5 hunks)
Additional Context Used
Ruff (3)
cvat/apps/engine/views.py (3)
44-44:
cvat.apps.dataset_manager.views
imported but unused
1967-1967: f-string without any placeholders
2714-2714: f-string without any placeholders
Additional comments not posted (4)
cvat/apps/engine/views.py (4)
2963-2975
: Ensure proper handling of theinstance_update_time
for projects with multiple tasks. The logic to determine the latest update time across all tasks and the project itself is sound.
Line range hint
2976-3055
: The logic to handle outdated exports and re-enqueue jobs is complex but correctly implements the requirements. It checks if the existing job's data is outdated compared to the instance update time and handles it accordingly. This includes canceling outdated jobs and re-enqueuing them if necessary.
3021-3033
: The logic to handle file existence and re-enqueue jobs if the file does not exist is correctly implemented. It ensures that files are re-exported if they are not found, which is crucial for maintaining data integrity.
3039-3055
: Properly handle failed jobs by providing detailed error information to the client. This is crucial for debugging and understanding why a job failed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- cvat/apps/dataset_manager/tests/test_rest_api_formats.py (9 hunks)
Additional comments not posted (2)
cvat/apps/dataset_manager/tests/test_rest_api_formats.py (2)
17-17
: Ensure proper usage of context managers to handle resources efficiently.Verification successful
Ensure proper usage of context managers to handle resources efficiently
The usage of
ExitStack
andcontextmanager
incvat/apps/dataset_manager/tests/test_rest_api_formats.py
appears to be correct and follows proper resource management practices. Here are the key points:
@contextmanager
Decorators: These are used to define context managers for handling processes and locks, ensuring that resources are properly managed and released.ExitStack
: This is used to manage multiple context managers, ensuring that all resources are properly cleaned up even if an error occurs.The code snippets provided show that these constructs are used appropriately to handle resources efficiently.
Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify the usage of ExitStack and contextmanager in the file to ensure proper resource management. # Test: Search for ExitStack and contextmanager usage. Expect: Proper usage patterns. rg --type python $'ExitStack|contextmanager' cvat/apps/dataset_manager/tests/test_rest_api_formats.pyLength of output: 133
Script:
#!/bin/bash # Description: Verify the usage of ExitStack and contextmanager in the file to ensure proper resource management. # Test: Search for ExitStack and contextmanager usage. Expect: Proper usage patterns. rg 'ExitStack|contextmanager' cvat/apps/dataset_manager/tests/test_rest_api_formats.pyLength of output: 269
Script:
#!/bin/bash # Description: Extract surrounding lines of ExitStack and contextmanager usage for detailed verification. # Extract 5 lines before and after each match to understand the context. rg -C 5 'ExitStack|contextmanager' cvat/apps/dataset_manager/tests/test_rest_api_formats.pyLength of output: 2772
1792-1792
: Ensure proper handling of file usage conflicts during cleanup operations.Verification successful
Ensure proper handling of file usage conflicts during cleanup operations
The search results indicate that
LockNotAvailableError
andFileIsBeingUsedError
are indeed referenced in the test filecvat/apps/dataset_manager/tests/test_rest_api_formats.py
. This suggests that there are mechanisms in place to handle file usage conflicts during cleanup operations.
LockNotAvailableError
is imported and used multiple times.FileIsBeingUsedError
is also imported and used in assertions.These references confirm that the code includes error handling for file usage conflicts.
Scripts Executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash # Description: Verify that file usage conflicts are handled correctly in cleanup operations. # Test: Search for conflict handling in cleanup operations. Expect: Proper error handling mechanisms. rg --type python $'LockNotAvailableError|FileIsBeingUsedError' cvat/apps/dataset_manager/tests/test_rest_api_formats.pyLength of output: 151
Script:
#!/bin/bash # Description: Verify that file usage conflicts are handled correctly in cleanup operations. # Test: Search for conflict handling in cleanup operations. Expect: Proper error handling mechanisms. rg --type py $'LockNotAvailableError|FileIsBeingUsedError' cvat/apps/dataset_manager/tests/test_rest_api_formats.pyLength of output: 1084
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #7864 +/- ##
===========================================
+ Coverage 83.35% 83.37% +0.02%
===========================================
Files 380 383 +3
Lines 40032 40171 +139
Branches 3766 3769 +3
===========================================
+ Hits 33367 33492 +125
- Misses 6665 6679 +14
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
Outside diff range and nitpick comments (1)
cvat/apps/dataset_manager/views.py (1)
99-129
: Add detailed logging for export operations.Consider adding more detailed logging in the
export
function to provide better visibility into the export process. This could include logging the start and end of the export operation, any errors encountered, and the final status of the export. This would help in debugging and monitoring the export operations.
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- cvat/apps/dataset_manager/tests/test_rest_api_formats.py (9 hunks)
- cvat/apps/dataset_manager/util.py (2 hunks)
- cvat/apps/dataset_manager/views.py (3 hunks)
Additional Context Used
Ruff (1)
cvat/apps/dataset_manager/views.py (1)
29-29:
.util.EXPORT_CACHE_DIR_NAME
imported but unused
Learnings (1)
cvat/apps/dataset_manager/tests/test_rest_api_formats.py (1)
User: zhiltsov-max PR: cvat-ai/cvat#7864 File: cvat/apps/dataset_manager/tests/test_rest_api_formats.py:1266-1266 Timestamp: 2024-05-17T12:20:07.528Z Learning: The CVAT project prefers using the `attrs` module over `dataclasses` due to its additional features and is commonly used across the project.
Additional comments not posted (4)
cvat/apps/dataset_manager/tests/test_rest_api_formats.py (4)
1266-1266
: Use Python'sdataclasses
module for simplicity and efficiency.- @define + from dataclasses import dataclass - @dataclass class SharedBase:Likely invalid or redundant comment.
1265-1265
: Enhance theExportBehaviorTest
class with additional comments for better understanding.+ # This class tests the behavior of concurrent export and cleanup operations. class ExportBehaviorTest(_DbTestBase):
Likely invalid or redundant comment.
419-419
: Optimize the cleanup of RQ jobs to enhance readability and understanding of its functionality.- self._clear_rq_jobs() # clean up from previous tests and iterations + self.clearResourceQueueJobs() # Clear resource queue jobs from previous tests and iterationsLikely invalid or redundant comment.
97-97
: Consider renaming_DbTestBase
toDatabaseTestBase
for improved readability and adherence to Python naming conventions.- class _DbTestBase(ApiTestBase): + class DatabaseTestBase(ApiTestBase):Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Outside diff range and nitpick comments (1)
cvat/apps/dataset_manager/tests/test_rest_api_formats.py (1)
421-421
: Repeated calls to_clear_rq_jobs
can be optimized.Consider refactoring the repeated calls to
_clear_rq_jobs
across different test methods into a setup or teardown method to reduce redundancy and improve maintainability.Also applies to: 528-528, 615-615, 865-865, 2071-2071, 2153-2153
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- cvat/apps/dataset_manager/tests/test_rest_api_formats.py (9 hunks)
- cvat/apps/dataset_manager/util.py (2 hunks)
Additional Context Used
Learnings (2)
cvat/apps/dataset_manager/util.py (1)
User: zhiltsov-max PR: cvat-ai/cvat#7864 File: cvat/apps/dataset_manager/util.py:149-157 Timestamp: 2024-05-27T14:31:27.282Z Learning: zhiltsov-max prefers error messages in the `get_export_cache_dir` function to remain less detailed to facilitate manual investigation.
cvat/apps/dataset_manager/tests/test_rest_api_formats.py (2)
User: zhiltsov-max PR: cvat-ai/cvat#7864 File: cvat/apps/dataset_manager/tests/test_rest_api_formats.py:2061-2061 Timestamp: 2024-05-27T14:34:32.440Z Learning: The project uses a specific naming convention for functions and methods, which should be adhered to in suggestions and comments.
User: zhiltsov-max PR: cvat-ai/cvat#7864 File: cvat/apps/dataset_manager/tests/test_rest_api_formats.py:1266-1266 Timestamp: 2024-05-17T12:20:07.528Z Learning: The CVAT project prefers using the `attrs` module over `dataclasses` due to its additional features and is commonly used across the project.
Additional comments not posted (14)
cvat/apps/dataset_manager/util.py (2)
149-157
: Review error handling inget_export_cache_dir
.The error handling in
get_export_cache_dir
is appropriate as it raises aFileNotFoundError
when the base directory does not exist. This is crucial for debugging and ensuring that the file operations are performed on valid directories.
106-144
: Review the use of locking mechanism.The implementation of
get_export_cache_lock
usesRedlock
for distributed locking, which is generally a robust choice. However, consider the following:
- Verify that the
auto_release_time
set byttl
is appropriate for the operations being locked.- Ensure that the lock acquisition logic (
block
andacquire_timeout
) is tested under high concurrency to prevent deadlocks or long wait times.cvat/apps/dataset_manager/tests/test_rest_api_formats.py (12)
1267-1267
: TheExportBehaviorTest
class effectively handles concurrency tests.This class is well-structured to test various concurrency issues related to dataset export and cleanup.
1268-1268
: Usage ofattrs
overdataclasses
is consistent with project standards.Good adherence to project conventions by using
attrs
for defining data classes.
1272-1272
: Inheritance fromSharedBase
is correctly implemented inSharedBool
andSharedString
.The inheritance hierarchy is clear and correctly utilizes shared functionality from
SharedBase
.Also applies to: 1284-1284
1298-1298
: Custom exception_LockTimeoutError
is well-defined for specific timeout errors.This custom exception helps in handling specific timeout scenarios clearly within the tests.
1301-1301
: Method overloading forset_condition
is correctly used to handle different types.The use of method overloading here allows for flexible setting of conditions based on variable type, enhancing code robustness.
Also applies to: 1305-1305
1326-1326
: Utility functionsside_effect
andchain_side_effects
are well-implemented for test flexibility.These functions provide a robust way to handle side effects in a controlled manner during testing, which is crucial for accurate concurrency testing.
Also applies to: 1339-1339
1358-1358
: Theprocess_closing
context manager is a critical addition for safely handling process termination.This method ensures that processes are properly closed after tests, preventing potential resource leaks.
1370-1370
: Thetest_concurrent_export_and_cleanup
method is comprehensive in testing race conditions.This method effectively simulates and tests potential race conditions between export and cleanup operations, ensuring robustness in concurrent environments.
1593-1593
: Thetest_concurrent_download_and_cleanup
method thoroughly tests file handling under concurrency.This test method is well-designed to ensure that file download and cleanup operations handle concurrency correctly without data loss or corruption.
1794-1794
: Thetest_export_can_create_file_and_cleanup_job
method correctly tests the export functionality under normal and edge conditions.This method ensures that the export functionality is robust, handling both file creation and subsequent cleanup effectively.
1881-1881
: Thetest_cleanup_can_remove_file
method effectively tests the cleanup functionality of the export system.This method ensures that files are properly removed after their intended lifecycle, preventing unnecessary storage usage.
99-99
: Consider renaming_DbTestBase
toDatabaseTestBase
for improved readability.- class _DbTestBase(ApiTestBase): + class DatabaseTestBase(ApiTestBase):Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (5)
- cvat/apps/dataset_manager/tests/test_rest_api_formats.py (9 hunks)
- cvat/apps/dataset_manager/util.py (2 hunks)
- cvat/requirements/base.txt (13 hunks)
- cvat/requirements/development.txt (3 hunks)
- cvat/requirements/testing.txt (1 hunks)
Files skipped from review due to trivial changes (1)
- cvat/requirements/development.txt
Additional Context Used
Learnings (2)
cvat/apps/dataset_manager/util.py (1)
User: zhiltsov-max PR: cvat-ai/cvat#7864 File: cvat/apps/dataset_manager/util.py:149-157 Timestamp: 2024-05-27T14:31:27.282Z Learning: zhiltsov-max prefers error messages in the `get_export_cache_dir` function to remain less detailed to facilitate manual investigation.
cvat/apps/dataset_manager/tests/test_rest_api_formats.py (2)
User: zhiltsov-max PR: cvat-ai/cvat#7864 File: cvat/apps/dataset_manager/tests/test_rest_api_formats.py:2061-2061 Timestamp: 2024-05-27T14:34:32.440Z Learning: The project uses a specific naming convention for functions and methods, which should be adhered to in suggestions and comments.
User: zhiltsov-max PR: cvat-ai/cvat#7864 File: cvat/apps/dataset_manager/tests/test_rest_api_formats.py:1266-1266 Timestamp: 2024-05-17T12:20:07.528Z Learning: The CVAT project prefers using the `attrs` module over `dataclasses` due to its additional features and is commonly used across the project.
Additional comments not posted (13)
cvat/requirements/testing.txt (3)
9-9
: Updated coverage tool version to 7.2.3.
12-14
: Updated fakeredis and its dependency lupa for testing environments.
14-14
: Updated sortedcontainers to maintain compatibility with the new fakeredis version.cvat/requirements/base.txt (3)
72-72
: Updated Django to 4.2.13, ensuring the application is up-to-date with the latest security patches and bug fixes.
103-103
: Updated django-health-check to 3.18.2 for improved health checking capabilities.
213-214
: Added new dependency, pottery, for enhanced Redis functionalities, likely related to the new locking mechanisms.cvat/apps/dataset_manager/util.py (3)
105-144
: The implementation of Redis locking using the Redlock algorithm inget_export_cache_lock
is correctly handled. Good use of context management and error handling.
149-157
: The functionget_export_cache_dir
correctly checks for the existence of the directory and raises an appropriate error if not found. This aligns with your preference for less detailed error messages to facilitate manual investigation.
190-231
: The functionparse_export_file_path
effectively uses regular expressions to parse the export file path and extract necessary metadata, with robust error handling for parsing failures.cvat/apps/dataset_manager/tests/test_rest_api_formats.py (4)
1268-1268
: Usingattrs
for defining data classes is consistent with project standards. Good use of the library's features for multiprocessing compatibility.
1794-1794
: The methodtest_concurrent_download_and_cleanup
effectively tests TOCTOU issues with file handling. It's a critical test that ensures the robustness of file operations under concurrency.
99-99
: Consider renaming_DbTestBase
toDatabaseTestBase
for better readability and adherence to Python naming conventions.- class _DbTestBase(ApiTestBase): + class DatabaseTestBase(ApiTestBase):Skipped due to learnings
User: zhiltsov-max PR: cvat-ai/cvat#7864 File: cvat/apps/dataset_manager/tests/test_rest_api_formats.py:2061-2061 Timestamp: 2024-05-27T14:34:32.440Z Learning: The project uses a specific naming convention for functions and methods, which should be adhered to in suggestions and comments.
2024-2024
: TheProjectDumpUpload
class tests are comprehensive and cover a wide range of scenarios. Ensure that all new formats added to the system are included in these tests to maintain coverage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Review Details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- utils/dataset_manifest/requirements.txt (2 hunks)
Files skipped from review due to trivial changes (1)
- utils/dataset_manifest/requirements.txt
In #7864 the cache cleanup function was updated. The function was not supposed to be called for anything except datasets, but it was called for backups and events. This PR changes these clients to use their own functions. - Fixed `ValueError: Couldn't parse filename components in 'c71eba87-0914-4ccb-b883-a1bf1612fbf8.csv'` errors
<!-- Raise an issue to propose your change (https://github.com/cvat-ai/cvat/issues). It helps to avoid duplication of efforts from multiple independent contributors. Discuss your ideas with maintainers to be sure that changes will be approved and merged. Read the [Contribution guide](https://docs.cvat.ai/docs/contributing/). --> <!-- Provide a general summary of your changes in the Title above --> ### Motivation and context <!-- Why is this change required? What problem does it solve? If it fixes an open issue, please link to the issue here. Describe your changes in detail, add screenshots. --> In #7864, a new file naming scheme was introduced for dataset cache entries, while the old naming convention was deprecated. The old names were parsed incorrectly, leading to failing cache cleanup attempts in `clear_export_cache()`. A test was added in that PR, but it didn't reproduce the old behavior at the full extent. - Fixed old dataset filename parsing (`ValueError: Couldn't parse filename components in 'annotations_cvat-for-images-11.ZIP` errors) - Fixed the test for old cache cleanup ### How has this been tested? <!-- Please describe in detail how you tested your changes. Include details of your testing environment, and the tests you ran to see how your change affects other areas of the code, etc. --> Unit tests. ### Checklist <!-- Go over all the following points, and put an `x` in all the boxes that apply. If an item isn't applicable for some reason, then ~~explicitly strikethrough~~ the whole line. If you don't do that, GitHub will show incorrect progress for the pull request. If you're unsure about any of these, don't hesitate to ask. We're here to help! --> - [ ] I submit my changes into the `develop` branch - [ ] I have created a changelog fragment <!-- see top comment in CHANGELOG.md --> - [ ] I have updated the documentation accordingly - [ ] I have added tests to cover my changes - [ ] I have linked related issues (see [GitHub docs]( https://help.github.com/en/github/managing-your-work-on-github/linking-a-pull-request-to-an-issue#linking-a-pull-request-to-an-issue-using-a-keyword)) - [ ] I have increased versions of npm packages if it is necessary ([cvat-canvas](https://github.com/cvat-ai/cvat/tree/develop/cvat-canvas#versioning), [cvat-core](https://github.com/cvat-ai/cvat/tree/develop/cvat-core#versioning), [cvat-data](https://github.com/cvat-ai/cvat/tree/develop/cvat-data#versioning) and [cvat-ui](https://github.com/cvat-ai/cvat/tree/develop/cvat-ui#versioning)) ### License - [ ] I submit _my code changes_ under the same [MIT License]( https://github.com/cvat-ai/cvat/blob/develop/LICENSE) that covers the project. Feel free to contact the maintainers if that's a concern. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - **Bug Fixes** - Improved handling of export file paths to correctly extract `instance_timestamp` and `format_tag` in dataset exports. - Updated test cases to check for the correct export paths, ensuring robust verification of export functionalities. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Co-authored-by: Boris Sekachev <boris.sekachev@yandex.ru>
Motivation and context
This PR addresses several problems:
This PR fixes the problems by the following:
finished RQ export jobs are not removed automatically on result retrieval. Now, they just use the export cache lifetime instead (should be continued in another PR)How has this been tested?
Checklist
develop
branch(cvat-canvas,
cvat-core,
cvat-data and
cvat-ui)
License
Feel free to contact the maintainers if that's a concern.
Summary by CodeRabbit
New Features
Bug Fixes
Dependencies
cryptography
to42.0.7
django
to4.2.13
django-health-check
to3.18.2
freezegun
to1.5.1
jinja2
to3.1.4
limits
to3.12.0
lxml
to5.2.2
orjson
to3.10.3
pottery
version3.0.0
tqdm
to4.66.4